A Case for the Global Access to Large Distributed Data Sets Using Data Webs Employing Photonic Data Services
Abstract
We argue that data webs employing specialized path services, network protocols, and data protocols can be an effective platform for accessing and analyzing millions of distributed data sets of gigabyte size and larger. We have built a prototype of such a data web and demonstrated that, by using specialized network and data protocols, it can effectively access, analyze, and mine distributed gigabyte-size data sets even over thousands of miles. The prototype uses a server that employs the DataSpace Transfer Protocol (DSTP). We assume that WSDL/SOAP/UDDI-based discovery and description services will enable this same infrastructure to scale to millions of such DSTPServers.
Related resources
Data webs for earth science data
We describe high performance data webs for earth science data which are designed for interactively analyzing small to moderate size remote data sets, as well as mining distributed data sets. Achieving high performance required developing specialized high performance transport services as well as specialized high performance middleware services for merging multiple data streams. Data webs comple...
Access control in ultra-large-scale systems using a data-centric middleware
The primary characteristic of an Ultra-Large-Scale (ULS) system is ultra-large size on any related dimension. A ULS system is generally considered a system-of-systems with heterogeneous nodes and autonomous domains. As the size of a system-of-systems grows and the demand for interoperability between sub-systems increases, achieving a more scalable and dynamic access control system becomes an im...
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...
An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity
Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening distance of file transmission and achieving files from ...
Solubility Prediction of Drugs in Supercritical Carbon Dioxide Using Artificial Neural Network
The descriptors computed by HyperChem® software were employed to represent the solubility of 40 drug molecules in supercritical carbon dioxide using an artificial neural network with the architecture of 15-4-1. The accuracy of the proposed method was evaluated by computing average of absolute error (AE) of calculated and experimental logarithm of solubilities. The AE (±SD) of data sets was 0.4 ...
Publication date: 2003